- payment_type = 0 (not in the data dictionary, where 1-6 are the valid values)
- Missing values in all three columns (passenger_count, congestion_surcharge, airport_fee)
- Valid values for the other monetary fields
- is_refund = False
- Most have VendorID = 2

- passenger_count: Use the median, as it is a reasonable estimate
- congestion_surcharge: Use 0, as these might be exempt trips
- airport_fee: Use 0, as these don't appear to be airport trips
- payment_type: Convert 0 to 1 (credit card), since these trips have tips and the data dictionary records tip_amount only for credit card tips
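
The imputation choices above could be applied with pandas roughly as follows. This is a minimal sketch, not the original analysis code: the function name `impute_trip_fields` and the four-row sample frame are hypothetical, and only the column names and fill rules come from the notes.

```python
import pandas as pd


def impute_trip_fields(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the fill rules from the notes applied.

    Hypothetical helper; assumes the four columns below are present.
    """
    out = df.copy()

    # passenger_count: fill gaps with the median of the observed values
    median_passengers = out["passenger_count"].median()
    out["passenger_count"] = out["passenger_count"].fillna(median_passengers)

    # congestion_surcharge and airport_fee: treat a missing fee as 0
    out["congestion_surcharge"] = out["congestion_surcharge"].fillna(0)
    out["airport_fee"] = out["airport_fee"].fillna(0)

    # payment_type: recode the undocumented 0 as 1 (credit card)
    out["payment_type"] = out["payment_type"].replace(0, 1)

    return out


# Made-up sample rows, for illustration only
sample = pd.DataFrame(
    {
        "payment_type": [0, 1, 2, 0],
        "passenger_count": [None, 1.0, 3.0, None],
        "congestion_surcharge": [None, 2.5, 2.5, None],
        "airport_fee": [None, 0.0, 1.25, None],
    }
)

cleaned = impute_trip_fields(sample)
print(cleaned)
```

The function works on a copy so the raw frame stays available for comparison. Note that the median is computed from the non-missing rows only, so it reflects trips whose passenger count was actually recorded.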